Exploring goodness of prosody by diverse matching templates
Abstract
In automatic speech grading systems, little research has addressed the issue of GOR (Goodness Of pRosody). In this paper we propose a novel method that takes advantage of the QBH (Query By Humming) techniques we used in the 2008 MIREX evaluation task. A set of standard samples from top-performing students is first selected as templates. A cascaded QBH structure then applies two metrics: MOMEL stylization followed by DTW distance, and the Fujisaki model followed by EMD distance. Sentence-level GOR is obtained by fusing the confidences between the target and each template, and a weighted sum of these scores gives the goodness at the passage level. Experimental results indicate that performance increases with the number of templates, and that the Fujisaki-EMD metric outperforms the MOMEL-DTW metric in terms of correlation. Their combination can be treated as a template-based GOR score. When complemented with our previous feature-based GOR score, the approach achieves a correlation of 0.432 and an EER of 17.90% on our corpus.
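The template-matching idea can be illustrated with a minimal sketch of the DTW half of the MOMEL-DTW metric. It assumes the pitch contours have already been stylized into short numeric sequences (the MOMEL step itself is not shown). The names `dtw_distance` and `template_score` are hypothetical, and the plain weighted mean used here is only a stand-in for the confidence fusion the abstract mentions, whose details are not given.

```python
from typing import Optional, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


def template_score(
    target: Sequence[float],
    templates: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Assumed fusion: weighted mean of DTW distances (lower means closer)."""
    if weights is None:
        weights = [1.0] * len(templates)
    total = sum(weights)
    return sum(
        w * dtw_distance(target, t) for w, t in zip(weights, templates)
    ) / total
```

With this sketch, a target contour that differs from a template only in timing (for example `[1, 2, 3]` against `[1, 2, 2, 3]`) gets distance zero, which is the property that makes DTW suitable for comparing utterances spoken at different rates.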
Similar resources
Information extraction and text generation of news reports for a Swedish-English bilingual spoken dialogue system
This paper describes an experimental dialog system designed to retrieve information and generate summaries of internet news reports related to user queries in Swedish and English. The extraction component is based on parsing and on matching the parsing output against stereotypic event templates. Bilingual text generation is accomplished by filling the templates after which grammar components ge...
Crafting the Illusion of Meaning: Template-Based Specification of Embodied Conversational Behavior
Templates are a widespread natural language technology that achieves believability within a narrow range of interaction and coverage. We consider templates for embodied conversational behavior. Such templates combine a specific pattern of marked-up text, specifying prosody and conversational signals as well as words, with similarly-annotated gaps that can be filled in by rule to yield a coheren...
Training prosodic phrasing rules for Chinese TTS systems
This paper describes several experiments designed to train prosodic phrasing models for Chinese TTS systems and to investigate the underlying rules that control Chinese prosody. First, we collected 559 sentences from news programs and built a large corpus for modeling Chinese prosody. Second, we selected 20 features and used classification and regression trees (CART) and transformational rule-b...
Stem-ML: language-independent prosody description
Stem-ML is a tagging system with a completely defined algorithm for translating the tags into quantitative prosody in any language. It separates the description of prosodic intentions from their execution, by modeling the interactions between accents. We designed Stem-ML to allow automated training of accent shapes and parameters from acoustic databases. Stem-ML is linguistically neutral: it al...
Integrating rule and template-based approaches for emotional Malay speech synthesis
The manipulation of prosody, including pitch, duration and intensity, is one of the leading approaches in synthesizing emotion. This paper reports work on the development of a Malay Emotional synthesizer capable of expressing four basic emotions, namely happiness, anger, sadness and fear for any form of text input with various intonation patterns using the prosody manipulation principle. The sy...